Learning Parametric Sparse Models for Image Super-Resolution
Learning accurate prior knowledge of natural images is of great importance for single-image super-resolution (SR). Existing SR methods either learn the prior from low/high-resolution patch pairs or estimate prior models from the input low-resolution (LR) image itself. The former methods learn high-frequency details directly; though effective, they are heuristic and have difficulty with blurred LR images, while the latter suffer from frequency aliasing. In this paper, we propose to combine these two lines of ideas for image super-resolution. More specifically, a parametric sparse prior of the desirable high-resolution (HR) image patches is learned from both the input LR image and a training image dataset. With the learned sparse prior, the sparse codes, and thus the HR image patches, can be accurately recovered by solving a sparse coding problem. Experimental results show that the proposed SR method outperforms existing state-of-the-art methods in terms of both subjective and objective image quality.
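The recovery step described above amounts to solving a patch-wise sparse coding problem. As a minimal illustrative sketch (not the authors' method — the dictionary, the ℓ1 penalty weight `lam`, and the use of plain ISTA are all assumptions for illustration), recovering a sparse code from a patch observation might look like:

```python
import numpy as np

def ista(D, y, lam=0.05, step=None, iters=200):
    """Recover a sparse code a minimizing 0.5*||y - D@a||^2 + lam*||a||_1
    via iterative shrinkage-thresholding (ISTA)."""
    if step is None:
        step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ a - y)               # gradient of the data-fit term
        z = a - step * grad                    # gradient descent step
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return a

# Toy setup: a random unit-norm dictionary and a patch built from 2 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)
true = np.zeros(32)
true[[3, 17]] = [1.5, -2.0]
y = D @ true                                   # observed patch (vectorized)
a = ista(D, y)
```

In the paper's setting the sparse prior learned from the LR input and the training set would additionally parameterize this problem (e.g. through adaptive weights), rather than using a single fixed `lam` as in this sketch.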
Supplementary Material: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
We perform finetuning with image-text contrastive and image-text matching losses. During inference, VLMO is first used as a dual encoder to obtain the top-k candidates, then the model is used as a fusion encoder to rerank the candidates. For the text-only pre-training data, we use English Wikipedia and BookCorpus [5]. Table 1: Ablation study of the shared self-attention module used in Multiway Transformer.
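The two-stage inference described above — cheap dual-encoder scoring over all candidates, then expensive fusion-encoder reranking of only the top-k — can be sketched generically as follows (a sketch only: the embeddings, the cosine stand-in for the fusion scorer, and `k=5` are illustrative assumptions, not VLMO's actual scoring functions):

```python
import numpy as np

def retrieve_and_rerank(text_emb, image_embs, fusion_score, k=5):
    """Two-stage retrieval: rank all candidates by a cheap dot-product
    similarity, then rerank only the top-k with a costlier fusion scorer."""
    sims = image_embs @ text_emb                    # stage 1: dual-encoder scores
    topk = np.argsort(-sims)[:k]                    # indices of the k best candidates
    # stage 2: rerank the shortlist with the (expensive) fusion scorer
    return sorted(topk, key=lambda i: -fusion_score(text_emb, image_embs[i]))

def cosine(a, b):
    """Stand-in fusion scorer for illustration (a real fusion encoder
    would jointly attend over text and image tokens)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy example: one query embedding against 100 candidate embeddings.
rng = np.random.default_rng(1)
q = rng.standard_normal(8)
cands = rng.standard_normal((100, 8))
ranked = retrieve_and_rerank(q, cands, cosine, k=5)
```

The design point is that the fusion encoder is only invoked k times per query instead of once per candidate, which is what makes reranking affordable at inference time.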